Analysis and classification of speech mode: whispered through shouted

Authors

Chi Zhang

John H. L. Hansen

Abstract

Variation in vocal effort is one of the most challenging problems in maintaining speech system performance for speech coding, speech recognition, and speaker recognition. A change in vocal effort (or mode) produces a fundamental change in speech production, not simply a change in volume. This is the first study to collectively consider the five speech modes: whispered, soft, neutral, loud, and shouted. After corpus development, analysis is performed for i) sound intensity level, ii) duration and silence percentage, iii) frame energy distribution, and iv) spectral tilt. The analysis reveals vocal-effort-dependent traits, which are then used to investigate speaker recognition. Matched vocal mode conditions yield a closed-set speaker ID rate of 97.62%, whereas mismatched vocal mode conditions yield 54.02%. Finally, a speech mode classification system is developed; its classification rates range from 44.5% to 98.5%, with errors arising from confusion with adjacent vocal modes. These advancements can provide improved speech/speaker modeling information, as well as classified vocal mode knowledge, to improve speech and language technology in real scenarios.
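Two of the measures named in the abstract, frame energy and silence percentage, can be illustrated with a minimal sketch. This is not the authors' procedure: the frame length (400 samples), hop (160 samples), the -40 dB silence threshold, and the function names `frame_energies_db` and `silence_percentage` are all assumptions chosen for illustration, and the input is a synthetic toy signal rather than speech from the corpus.

```python
import math

def frame_energies_db(samples, frame_len=400, hop=160, floor=1e-12):
    """Return the log energy (dB) of each full frame of a mono signal.

    frame_len and hop are illustrative assumptions (25 ms / 10 ms at 16 kHz).
    """
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        # The floor avoids log10(0) on digitally silent frames.
        energies.append(10.0 * math.log10(max(mean_sq, floor)))
    return energies

def silence_percentage(energies_db, threshold_db=-40.0):
    """Percentage of frames whose energy is below threshold_db (assumed value)."""
    if not energies_db:
        return 0.0
    silent = sum(1 for e in energies_db if e < threshold_db)
    return 100.0 * silent / len(energies_db)

# Toy input: 0.5 s of silence followed by 0.5 s of a 200 Hz tone at 16 kHz.
rate = 16000
signal = [0.0] * (rate // 2) + [
    0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate // 2)
]
energies = frame_energies_db(signal)
print(f"{silence_percentage(energies):.1f}% of frames fall below the threshold")
```

With real recordings, a histogram of the values returned by `frame_energies_db` for each speech mode would give a rough picture of a frame energy distribution; a fixed threshold is a simplification, and a practical system would more likely use a proper voice activity detector.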

Similar resources

Acoustic analysis and feature transformation from neutral to whisper for speaker identification within whispered speech audio streams

Whispered speech is an alternative speech production mode from neutral speech, which is used by talkers intentionally in natural conversational scenarios to protect privacy and to avoid certain content from being overheard or made public. Due to the profound differences between whispered and neutral speech in vocal excitation and vocal tract function, the performance of automatic speaker identi...

Cries and Whispers - Classification of Vocal Effort in Expressive Speech

The expansion of the video games industry raises innovative and challenging issues for speech technologies, e.g. the development of automatic content-based speech processing and speech recognition systems in the context of video games postproduction and voice casting. This paper presents a large-scale study on the classification of vocal effort in expressive speech for video games. Changes in v...

Effects of Within-Talker Variability on Speech Intelligibility in Mandarin-Speaking Adult and Pediatric Cochlear Implant Patients

Cochlear implant (CI) speech performance is typically evaluated using well-enunciated speech produced at a normal rate by a single talker. CI users often have greater difficulty with variations in speech production encountered in everyday listening. Within a single talker, speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context....

A discriminative analysis within and across voiced and unvoiced consonants in neutral and whispered speech in multiple Indian languages

Whispered speech lacks the vocal cord vibration which is typically used to distinguish voiced and unvoiced consonants, making their discrimination a challenging task. In this work, we objectively and subjectively quantify the amount of discrimination between a voiced (V) consonant and its unvoiced (UV) counterpart using seven V-UV consonant pairs in six Indian languages, in neutral and whisper...

A comprehensive vowel space for whispered speech.

Whispered speech is a relatively common form of communications, used primarily to selectively exclude or include potential listeners from hearing a spoken message. Despite the everyday nature of whispering, and its undoubted usefulness in vocal communications, whispers have received relatively little research effort to date, apart from some studies analyzing the main whispered vowels and some q...

Publication date: 2007
